Chromatin Immunoprecipitation Sequencing    ◾    225

These two files are the largest, and they are in bedGraph format, which can be visualized in the UCSC Genome Browser. The bedGraph format was developed to display genomic information as a track in a genome browser. It consists of four tab-separated columns: chromosome, start, end, and value. The chromosome coordinates are 0-based and half-open (the start position is included and the end position is not). The positions (start and end) are listed in ascending order. The track displays the values in the value column, which can be integer or real, positive or negative.
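
The four-column layout described above can be read with a few lines of Python. The following is only a minimal sketch: the record type `BedGraphRecord`, the function `parse_bedgraph_line`, and the two data lines are hypothetical and are not taken from the MACS3 output. It assumes tab-separated data lines with no track or comment lines.

```python
from typing import NamedTuple


class BedGraphRecord(NamedTuple):
    """One bedGraph data line (hypothetical helper type)."""
    chrom: str
    start: int    # 0-based, included in the interval
    end: int      # not included in the interval
    value: float  # may be integer or real, positive or negative


def parse_bedgraph_line(line: str) -> BedGraphRecord:
    """Split one tab-separated bedGraph data line into its four columns."""
    chrom, start, end, value = line.rstrip("\n").split("\t")
    return BedGraphRecord(chrom, int(start), int(end), float(value))


# Made-up example lines for illustration only.
lines = [
    "chr1\t100\t150\t2.5",
    "chr1\t150\t300\t-0.75",
]
records = [parse_bedgraph_line(line) for line in lines]

# With 0-based, half-open coordinates the interval length is end - start.
lengths = [r.end - r.start for r in records]
print(records)
print(lengths)
```

Because the end position is excluded, adjacent intervals such as 100-150 and 150-300 do not overlap.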

BED format: The “*_summits.bed” file is in BED format. The BED (Browser Extensible Data) format defines the data lines that are displayed in an annotation track of a genome browser. A BED file contains three required fields (chromosome, start, and end) and nine additional optional fields (name, score, strand, thick start, thick end, RGB (an RGB color value), block count, block sizes, and block starts).
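
As an illustration of the required and optional fields, the sketch below maps one BED line to a dictionary that holds only the fields actually present. The function name `parse_bed_line` and the example line are hypothetical. The field names follow the UCSC naming of the twelve BED columns, and the score is left as text because its type can differ between tools.

```python
# The twelve BED columns in order: three required, nine optional.
BED_FIELDS = [
    "chrom", "start", "end",
    "name", "score", "strand",
    "thickStart", "thickEnd", "itemRgb",
    "blockCount", "blockSizes", "blockStarts",
]


def parse_bed_line(line: str) -> dict:
    """Return a dict of the BED fields present on one tab-separated line."""
    values = line.rstrip("\n").split("\t")
    if not 3 <= len(values) <= len(BED_FIELDS):
        raise ValueError(f"expected 3 to 12 columns, got {len(values)}")
    # zip stops at the shorter sequence, so absent optional fields are omitted.
    record = dict(zip(BED_FIELDS, values))
    record["start"] = int(record["start"])
    record["end"] = int(record["end"])
    return record


# Made-up five-column line for illustration only.
rec = parse_bed_line("chr1\t1000\t1001\tpeak_1\t35.2")
print(rec)
```

A line with fewer than three columns raises a `ValueError`, which reflects that chromosome, start, and end are mandatory.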

The “*_summits.bed” file contains the summit location of every peak. The fifth column in this file is the same as in the narrowPeak file. This file can be used for finding motifs at the binding sites.
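
Motif-finding tools usually work on short sequences around each summit rather than on single positions. The sketch below shows one possible way to turn a summit position into a fixed-width window. The function `summit_window` and the default flank of 50 bp are assumptions made for illustration, not values prescribed by MACS3, and the start position is treated as the 0-based summit position. Extracting the sequences for these windows is left to a separate tool.

```python
def summit_window(chrom: str, summit_start: int, flank: int = 50) -> tuple:
    """Return a 0-based, half-open window centered on a summit.

    summit_start is the 0-based start of the summit entry; flank is a
    hypothetical number of bases added on each side.
    """
    start = max(0, summit_start - flank)  # never go below position 0
    end = summit_start + flank + 1        # +1 keeps the summit base itself
    return chrom, start, end


# Made-up summit positions for illustration only.
window = summit_window("chr1", 1000)
near_edge = summit_window("chr1", 20)
print(window, near_edge)
```

With a flank of 50 bp, a summit away from the chromosome start yields a window of 101 bp. Windows near the start of a chromosome are shorter because the start is clamped at 0.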

R script file: The “*_model.r” file is an R script that produces a PDF file containing the peak model plot and the cross-correlation plot. The PDF files for our ChIP-Seq data are generated using the “Rscript” command as follows:

Rscript chip1_model.r
Rscript chip2_model.r
Rscript chip3_model.r

BED6+4 format: The “*_peaks.narrowPeak” file is in BED6+4 format, which stores information about signal enrichment of the called peaks based on pooled, normalized read counts. This format consists of ten tab-separated columns: chromosome, start, end, name (region name), score (peak density score from 0 to 1000, based on the signal value), strand (+/-), signal value (average enrichment), p-value (reported as -log10 of the p-value), FDR or q-value

FIGURE 6.4  MACS3 output files for the three ChIP-Seq datasets.